1,724 research outputs found

    Estimating Effects and Making Predictions from Genome-Wide Marker Data

    In genome-wide association studies (GWAS), hundreds of thousands of genetic markers (SNPs) are tested for association with a trait or phenotype. Reported effects tend to be larger in magnitude than the true effects of these markers, the so-called "winner's curse." We argue that the classical definition of unbiasedness is not useful in this context and propose to use a different definition of unbiasedness that is a property of the estimator we advocate. We suggest an integrated approach to the estimation of the SNP effects and to the prediction of trait values, treating SNP effects as random instead of fixed effects. Statistical methods traditionally used in the prediction of trait values in the genetics of livestock, which predates the availability of SNP data, can be applied to analysis of GWAS, giving better estimates of the SNP effects and predictions of phenotypic and genetic values in individuals.
    Comment: Published at http://dx.doi.org/10.1214/09-STS306 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
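
    The winner's curse, and the effect of treating SNP effects as random, can be illustrated with a small simulation. This is a minimal sketch rather than the authors' method: it assumes true effects drawn from a normal distribution with known variance `sigma_b**2`, a common known standard error `se` for every SNP, and a hypothetical significance cut-off `z_cut`. All names and numeric values are illustrative assumptions.

```python
import random


def simulate(m=50000, sigma_b=0.02, se=0.03, z_cut=3.0, seed=1):
    """Compare naive and shrunken effect estimates among 'significant' SNPs.

    Assumptions (illustrative, not taken from the paper):
      true effect  b     ~ N(0, sigma_b^2)
      estimate     b_hat = b + noise, noise ~ N(0, se^2)
    Under these assumptions E[b | b_hat] = shrink * b_hat.
    """
    rng = random.Random(seed)
    # Shrinkage factor for a random effect with known variances.
    shrink = sigma_b ** 2 / (sigma_b ** 2 + se ** 2)

    selected = []
    for _ in range(m):
        b = rng.gauss(0.0, sigma_b)
        b_hat = b + rng.gauss(0.0, se)
        # Keep only SNPs that pass the significance cut-off.
        if abs(b_hat) / se > z_cut:
            selected.append((b, b_hat, shrink * b_hat))

    n = len(selected)
    # Align the sign of each true effect with its estimate, so that
    # overestimation shows up as a gap between the means.
    true_mean = sum(b if b_hat > 0 else -b for b, b_hat, _ in selected) / n
    naive_mean = sum(abs(b_hat) for _, b_hat, _ in selected) / n
    shrunk_mean = sum(abs(b_s) for _, _, b_s in selected) / n
    return n, true_mean, naive_mean, shrunk_mean


if __name__ == "__main__":
    n, true_mean, naive_mean, shrunk_mean = simulate()
    print(f"selected SNPs:          {n}")
    print(f"mean true effect:       {true_mean:.4f}")
    print(f"mean naive estimate:    {naive_mean:.4f}")
    print(f"mean shrunken estimate: {shrunk_mean:.4f}")
```

    Under these assumptions the naive estimates of the selected SNPs overstate the true effects, while the shrunken estimates stay close to them on average even after selection.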

    Analysis of pooled DNA samples on high density arrays without prior knowledge of differential hybridization rates

    Array based DNA pooling techniques facilitate genome-wide scale genotyping of large samples. We describe a structured analysis method for pooled data using internal replication information in large scale genotyping sets. The method takes advantage of information from single nucleotide polymorphisms (SNPs) typed in parallel on a high density array to construct a test statistic with desirable statistical properties. We utilize a general linear model to appropriately account for the structured multiple measurements available with array data. The method does not require the use of additional arrays for the estimation of unequal hybridization rates and hence scales readily to accommodate arrays with several hundred thousand SNPs. Tests for differences between cases and controls can be conducted with very few arrays. We demonstrate the method on 384 endometriosis cases and controls, typed using Affymetrix Genechip© HindIII 50 K arrays. For a subset of this data there were accurate measures of hybridization rates available. Assuming equal hybridization rates is shown to have a negligible effect upon the results. With a total of only six arrays, the method extracted one-third of the information (in terms of equivalent sample size) available with individual genotyping (requiring 768 arrays). With 20 arrays (10 for cases, 10 for controls), over half of the information could be extracted from this sample

    Population genetic differentiation of height and body mass index across Europe

    Across-nation differences in the mean of complex traits such as obesity and stature are common [1–8], but the reasons for these differences are not known. Here, we find evidence that many independent loci of small effect combine to create population genetic differences in height and body mass index (BMI) in a sample of 9,416 individuals across 14 European countries. Using discovery data on over 250,000 individuals and unbiased estimates of effect sizes from 17,500 sib pairs, we estimate that 24% (95% CI: 9%, 41%) and 8% (95% CI: 4%, 16%) of the captured additive genetic variance for height and BMI across Europe are attributed to among-population genetic differences. Population genetic divergence differed significantly from that expected under a null model (P < 3.94e−08 for height and P < 5.95e−04 for BMI), and we find an among-population genetic correlation for tall and slender nations (r = −0.80, 95% CI: −0.95, −0.60), contrasting with no genetic correlation between height and BMI within populations (r = −0.016, 95% CI: −0.041, 0.001), consistent with selection on height genes that also act to reduce BMI. Observations of mean height across nations correlated with the predicted genetic means for height (r = 0.51, P < 0.001), so that a proportion of observed differences in height within Europe reflect genetic factors. In contrast, observed mean BMI did not correlate with the genetic estimates (P < 0.58), implying that genetic differentiation in BMI is masked by environmental differences across Europe

    QTL detection and allelic effects for growth and fat traits in outbred pig populations

    Quantitative trait loci (QTL) for growth and fatness traits have previously been identified on chromosomes 4 and 7 in several experimental pig populations. The segregation of these QTL in commercial pigs was studied in a sample of 2713 animals from five different populations. Variance component analysis (VCA) using a marker-based identity by descent (IBD) matrix was applied. The IBD coefficient was estimated with simple deterministic (SMD) and Markov chain Monte Carlo (MCMC) methods. Data for two growth traits, average daily gain on test and whole life daily gain, and back fat thickness were analysed. With both methods, seven out of 26 combinations of population, chromosome and trait, were significant. Additionally, QTL genotypic and allelic effects were estimated when the QTL effect was significant. The range of QTL genotypic effects in a population varied from 4.8% to 10.9% of the phenotypic mean for growth traits and 7.9% to 19.5% for back fat trait. Heritabilities of the QTL genotypic values ranged from 8.6% to 18.2% for growth traits, and 14.5% to 19.2% for back fat. Very similar results were obtained with both SMD and MCMC. However, the MCMC method required a large number of iterations, and hence computation time, especially when the QTL test position was close to the marker

    Prediction of individual genetic risk to disease from genome-wide association studies

    Empirical studies suggest that the effect sizes of individual causal risk alleles underlying complex genetic diseases are small, with most genotype relative risks in the range of 1.1-2.0. Although the increased risk of disease for a carrier is small for any single locus, knowledge of multiple-risk alleles throughout the genome could allow the identification of individuals that are at high risk. In this study, we investigate the number and effect size of risk loci that underlie complex disease constrained by the disease parameters of prevalence and heritability. Then we quantify the value of prediction of genetic risk to disease using a range of realistic combinations of the number, size, and distribution of risk effects that underlie complex diseases. We propose an approach to assess the genetic risk of a disease in healthy individuals, based on dense genome-wide SNP panels. We test this approach using simulation. When the number of loci contributing to the disease is >50, a large case-control study is needed to identify a set of risk loci for use in predicting the disease risk of healthy people not included in the case-control study. For diseases controlled by 1000 loci of mean relative risk of only 1.04, a case-control study with 10,000 cases and controls can lead to selection of ∼75 loci that explain >50% of the genetic variance. The 5% of people with the highest predicted risk are three to seven times more likely to suffer the disease than the population average, depending on heritability and disease prevalence. Whether an individual with known genetic risk develops the disease depends on known and unknown environmental factors
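
    How much more likely the highest-risk individuals are to develop a disease can be sketched numerically. The abstract does not state the model behind its figures, so the following is only an illustration under an assumed liability threshold model: liability is standard normal, disease occurs above a threshold set by the prevalence, and the genetic predictor explains a fraction `r2` of the variance in liability. The function name and all parameters are hypothetical, and the sketch is not expected to reproduce the numbers reported in the paper.

```python
from statistics import NormalDist


def top_fraction_relative_risk(prevalence, r2, top=0.05, steps=4000):
    """Risk in the top `top` fraction of predicted risk, relative to prevalence.

    Assumed model (not taken from the paper):
      liability = g + e, g ~ N(0, r2), e ~ N(0, 1 - r2), with 0 < r2 < 1
      disease occurs when liability exceeds the threshold t.
    """
    std = NormalDist()
    t = std.inv_cdf(1.0 - prevalence)      # liability threshold
    sd_g = r2 ** 0.5                       # SD of the genetic predictor
    sd_e = (1.0 - r2) ** 0.5               # SD of the residual
    g_dist = NormalDist(0.0, sd_g)

    lo = sd_g * std.inv_cdf(1.0 - top)     # predictor cut-off for the top group
    hi = 8.0 * sd_g                        # upper limit; the tail beyond is negligible
    width = (hi - lo) / steps

    # Midpoint-rule integral of P(disease | g) * density(g) over the top group.
    total = 0.0
    for i in range(steps):
        g = lo + (i + 0.5) * width
        risk_given_g = 1.0 - std.cdf((t - g) / sd_e)
        total += risk_given_g * g_dist.pdf(g) * width

    risk_in_top = total / top
    return risk_in_top / prevalence


if __name__ == "__main__":
    for r2 in (0.1, 0.2, 0.4):
        rr = top_fraction_relative_risk(prevalence=0.05, r2=r2)
        print(f"r2={r2:.1f}: relative risk in top 5% = {rr:.2f}")
```

    Under this assumed model the relative risk of the top group rises as the predictor explains more of the variance in liability, which matches the qualitative dependence on heritability described in the abstract.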

    Genetic architecture of body size in mammals

    Much of the heritability for human stature is caused by mutations of small-to-medium effect. This is because detrimental pleiotropy restricts large-effect mutations to very low frequencies

    Explaining additional genetic variation in complex traits

    Genome-wide association studies (GWAS) have provided valuable insights into the genetic basis of complex traits, discovering >6000 variants associated with >500 quantitative traits and common complex diseases in humans. The associations identified so far represent only a fraction of those that influence phenotype, because there are likely to be many variants across the entire frequency spectrum, each of which influences multiple traits, with only a small average contribution to the phenotypic variance. This presents a considerable challenge to further dissection of the remaining unexplained genetic variance within populations, which limits our ability to predict disease risk, identify new drug targets, improve and maintain food sources, and understand natural diversity. This challenge will be met within the current framework through larger sample size, better phenotyping, including recording of nongenetic risk factors, focused study designs, and an integration of multiple sources of phenotypic and genetic information. The current evidence supports the application of quantitative genetic approaches, and we argue that one should retain simpler theories until simplicity can be traded for greater explanatory power

    Large-scale genomics unveils the genetic architecture of psychiatric disorders

    Family study results are consistent with genetic effects making substantial contributions to risk of psychiatric disorders such as schizophrenia, yet robust identification of specific genetic variants that explain variation in population risk had been disappointing until the advent of technologies that assay the entire genome in large samples. We highlight recent progress that has led to a better understanding of the number of risk variants in the population and the interaction of allele frequency and effect size. The emerging genetic architecture implies a large number of contributing loci (that is, a high genome-wide mutational target) and suggests that genetic risk of psychiatric disorders involves the combined effects of many common variants of small effect, as well as rare and de novo variants of large effect. The capture of a substantial proportion of genetic risk facilitates new study designs to investigate the combined effects of genes and the environment

    Detection of multiple quantitative trait loci and their pleiotropic effects in outbred pig populations

    Background: Simultaneous detection of multiple QTLs (quantitative trait loci) may allow more accurate estimation of genetic effects. We have analyzed outbred commercial pig populations with different single and multiple models to clarify their genetic properties and, in addition, we have investigated pleiotropy among growth and obesity traits based on allelic correlation within a gamete. Methods: Three closed populations, (A) 427 individuals from a Yorkshire and Large White synthetic breed, (B) 547 Large White individuals and (C) 531 Large White individuals, were analyzed using a variance component method with one-QTL and two-QTL models. Six markers on chromosome 4 and five to seven markers on chromosome 7 were used. Results: Population A displayed a high test statistic for the fat trait when applying the two-QTL model with two positions on two chromosomes. The estimated heritabilities for polygenic effects and for the first and second QTL were 19%, 17% and 21%, respectively. The high correlation of the estimated allelic effect on the same gamete and QTL test statistics suggested that the two separate QTL which were detected on different chromosomes both have pleiotropic effects on the two fat traits. Analysis of population B using the one-QTL model for three fat traits found a similar peak position on chromosome 7. Allelic effects of three fat traits from the same gamete were highly correlated, suggesting the presence of a pleiotropic QTL. In population C, three growth traits also displayed similar peak positions on chromosome 7 and allelic effects from the same gamete were correlated. Conclusion: Detection of the second QTL in a model reduced the polygenic heritability and should improve accuracy of estimated heritabilities for both QTLs.